- CHAPTER 21 - .COM FILES
-
- All the programs that we have made so far have been .EXE files.
- That means that the file extension has always been .EXE after
- linking. When you have an .EXE file, the program loader makes
- certain adjustments to the machine code at run time. These
- adjustments fill in the actual segment addresses of the segments.
-
- There is another type of executable file, and that is a .COM
- file. When the loader puts a .COM file into memory, it makes no
- adjustments; it simply reads the file directly from disk into
- memory. Therefore, a .COM file loads faster. However, there is a
- restriction:
-
- All code, data, and the stack must be in a single segment.
- This effectively limits a .COM program to 65536 bytes of
- code+data+stack. {1}
-
- In general, it is easier for program development to keep code and
- data separate, but we can mix them together. Let's look at the
- template for a .COM file. It is called COMTEMP.ASM.
-
- ; com file template
- ; put name here
- ; * * * * * * * * * * * * * * *
- INCLUDE \PUSHREGS.MAC
-
- COMSEG SEGMENT PUBLIC 'CODE'
-
- ASSUME cs:COMSEG, ds:COMSEG, es:COMSEG, ss:COMSEG
- ; - - - - - - - - - -
- main proc NEAR
-
- ORG 100h
- start:
-
- ; - - - - START CODE BELOW THIS LINE
-
- ; - - - - END CODE ABOVE THIS LINE
-
- ret
-
- main endp
-
- ; - - - - START SUBROUTINES BELOW THIS LINE
-
- ; - - - - END SUBROUTINES ABOVE THIS LINE
-
- ; - - - - START DATA BELOW THIS LINE
-
- ; - - - - END DATA ABOVE THIS LINE
-
- Stack_area dw 500 dup (?)
-
- COMSEG ENDS
- ; * * * * * * * * * * * * * * *
- END start
-
- ____________________
-
- 1. Actually, it is possible to get around this restriction
- with suitable fiddling, but by that time you have made the
- program much more complicated so you have lost any advantage that
- you had by not using an .EXE file.
-
- ______________________
-
- The PC Assembler Tutor - Copyright (C) 1989 Chuck Nelson
-
- We will take the things in order. First, there is the line:
-
- ORG 100h
-
- This tells the assembler to start counting offsets at 100h, which
- effectively reserves the first 100h (256d) bytes of the segment.
- Also notice that the setup code that we had in a .EXE file is
- missing; we start with the code immediately. This all has to do
- with the PSP (program segment prefix).{2}
-
-
- PSP
-
- You will remember from the chapter describing the .EXE template
- file that we wrote:
-
- push ds
- sub ax, ax
- push ax
-
- because upon entry to an .EXE file, DS contains the segment
- address of the PSP, and at offset 0000 (that is, the first byte
- of the segment) there is a machine instruction for an orderly
- exit from the program. In an .EXE file, the PSP is somewhere in
- memory, put there by the loader. In a .COM file, the PSP is the
- first 100h (256d) bytes of the segment. The loader fills in the
- PSP and then reads the file directly from disk. That means that
- the machine instruction for an orderly exit is at 0000 of the
- current segment. Notice that we have a NEAR procedure. A .COM
- file has a near return which will stay in the same segment.
- Normally we would have to write:
-
- sub ax, ax
- push ax
-
- to put the return address 0000 on the stack, but for a .COM file
- the loader pushes 0000 on the stack before giving control to the
- program. Why the loader provides this service for a .COM file but
- not for an .EXE file is a mystery. In any case, with a .COM file,
- you don't need to push the return address on the stack, since
- it's there already.
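-
- The near return works because of what sits at both ends of the
- jump. As a minimal sketch, here are three equivalent ways to
- leave a .COM program. Only the first is used in this chapter's
- template. The other two are shown for comparison and are not part
- of the template (INT 21h function 4Ch is the DOS "terminate with
- return code" call).
-
-         ; 1. what the template does
-         ret                 ; pops the 0000 pushed by the loader, so
-                             ; execution continues at offset 0000 of the PSP
-
-         ; 2. the instruction that is stored at offset 0000 of the PSP
-         int  20h            ; needs CS = PSP segment, true in a .COM file
-
-         ; 3. the DOS terminate call
-         mov  ax, 4C00h      ; ah = 4Ch (terminate), al = 0 (return code)
-         int  21h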
- ____________________
-
- 2. If you want to know what the PSP (program segment prefix)
- is exactly, consult either of the two books on hardware and
- interrupts. The PSP is always exactly 256 bytes long.
-
- We have the assembler directive:
-
- ASSUME cs:COMSEG, ds:COMSEG, es:COMSEG, ss:COMSEG
-
- The reason for including all four segment registers is (1) the
- loader actually loads the same segment address into all four
- segment registers and (2) the assembler will use the natural
- segment for all instructions. The BP instructions will refer to
- SS, the data instructions will refer to DS, and the destination
- operands of the string instructions (SCAS, CMPS, MOVS) will refer
- to ES. That means that the assembler will not use segment
- overrides. This helps avoid something called phase errors, which
- will be explained at the end.
-
-
- Right after the ORG directive is the label start:, which marks
- the first instruction executed in the program.
-
- ORG 100h ; 256d
- start:
-
- This is inflexible. After reading the program into memory, the
- loader ALWAYS starts the program at 100h. All .COM files start
- execution at 100h. Period.
-
- There is space for subroutines, space for data, and finally space
- for the stack. The order of subroutines and data can be changed.
- The stack space is technically not necessary, but it is a good
- reminder to leave it there. All .COM files take up a whole 65536
- byte segment in memory, no matter how short they are.{3} The
- stack is at the very end of the segment, so if your program is
- 200 bytes long, you have about 65000 bytes of stack space (65536
- minus the 256 byte PSP and the 200 bytes of program).
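-
- A rough picture of the segment just before the first instruction
- runs may help. This sketch assumes that DOS had a full 64K
- segment to give the program. The first two offsets come from the
- description above; the initial stack pointer value is the usual
- one in that case and is an assumption here.
-
-         offset 0000h    PSP, filled in by the loader (100h bytes)
-         offset 0100h    start: first byte read from the .COM file
-            ...          code, subroutines, data
-            ...          unused space; the stack grows down into it
-         offset FFFEh    top of stack; holds the 0000 pushed by the loader
-
- CS, DS, ES and SS all hold the address of this one segment.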
-
- Let's make a simple program. It is the famous "Hello, world"
- program.
-
- ; - - - - START CODE BELOW THIS LINE
- mov dx, offset mr_happy_face ; int 21h, function 9
- mov ah, 9
- int 21h
- ; - - - - END CODE ABOVE THIS LINE
-
- ; - - - - START DATA BELOW THIS LINE
- mr_happy_face db "Hello, world!", 13, 10, "$"
- ; - - - - END DATA ABOVE THIS LINE
-
- This program prints "Hello, world!" and a new line. The dollar
- sign signifies the end of the string for this interrupt. It is
- simple enough, and it gives us the chance to look at the extra
- step needed to make a .COM file. Assemble it, and link it. When
- you link it, you will get a warning that there is no stack
- segment. For .COM files, this warning is unimportant. We now have
- an .EXE file (which, by the way, won't run correctly). How do we
- make it a .COM file?
- ____________________
-
- 3. This is a slightly abridged explanation. For the real
- details, consult Microsoft's "The MS-DOS Encyclopedia".
-
- Among the programs that you got with DOS is one called EXE2BIN.
- It takes an .EXE file, and if possible, converts it into a .COM
- file. You simply write the name of the file you want converted
- and the name of the converted file. Both of these names must have
- the full file extension:
-
- exe2bin programA.exe programA.com
-
- You will now have programA.com as a .COM file. If we now write:
-
- C> programA
-
- Will the loader load the .COM file or the .EXE file? The DOS
- order for execution is .COM files first, then .EXE files, then
- .BAT files. DOS will execute the .COM file. Try it.
-
- That was pretty easy. Now, as a technical exercise, we are going
- to link together three different files. Here's the first file:
-
- ; prog1.asm
- ; * * * * * * * * * * * * * * *
- INCLUDE \PUSHREGS.MAC
- COMSEG SEGMENT PUBLIC 'CODE'
-
- ASSUME cs:COMSEG, ds:COMSEG, es:COMSEG, ss:COMSEG
-
- PUBLIC message1
- EXTRN subroutine_a:NEAR, message3:BYTE
- ; - - - - - - - - - -
- main proc NEAR
-
- ORG 100h
- start:
- mov ah, 9 ; int 21h, ah = 9
- mov dx, offset message1
- int 21h
-
- mov ah, 9 ; int 21h, ah = 9
- mov dx, offset message3
- int 21h
-
- call subroutine_a
- ret
-
- main endp
-
- message1 db "This is from the main program.", 13, 10, "$"
-
- COMSEG ENDS
- ; * * * * * * * * * * * * * * *
- END start
- ; ----------
-
- Here is the second file:
- ; prog2.asm
- ; * * * * * * * * * * * * * * *
- INCLUDE \PUSHREGS.MAC
- COMSEG SEGMENT PUBLIC 'CODE'
-
- ASSUME cs:COMSEG, ds:COMSEG, es:COMSEG, ss:COMSEG
-
- PUBLIC message2, subroutine_a
- EXTRN subroutine_b:NEAR, message3:BYTE
- ; - - - - - - - - - -
- subroutine_a proc NEAR
-
- PUSHREGS ax, dx
- mov ah, 9 ; int 21h, ah = 9
- mov dx, offset message2
- int 21h
-
- mov ah, 9 ; int 21h, ah = 9
- mov dx, offset message3
- int 21h
-
- call subroutine_b
- POPREGS ax, dx
- ret
-
- subroutine_a endp
-
- message2 db "This is from subroutine A.", 13, 10, "$"
-
- COMSEG ENDS
- ; * * * * * * * * * * * * * * *
- END
- ; ----------
-
- And here is the third file:
-
- ; prog3.asm
- ; * * * * * * * * * * * * * * *
- INCLUDE \PUSHREGS.MAC
- COMSEG SEGMENT PUBLIC 'CODE'
-
- ASSUME cs:COMSEG, ds:COMSEG, es:COMSEG, ss:COMSEG
-
- PUBLIC message3, subroutine_b
- EXTRN message1:BYTE, message2:BYTE
- ; - - - - - - - - - -
- subroutine_b proc NEAR
-
- PUSHREGS ax, dx
- mov ah, 9 ; int 21h, ah = 9
- mov dx, offset message1
- int 21h
-
- mov ah, 9 ; int 21h, ah = 9
- mov dx, offset message2
- int 21h
-
- POPREGS ax, dx
- ret
-
- subroutine_b endp
-
- message3 db "This is from subroutine B.", 13, 10, "$"
-
- COMSEG ENDS
- ; * * * * * * * * * * * * * * *
- END
- ; ----------
-
- If you look at them, you will see that all they do is print
- messages; sometimes from their own file, sometimes from external
- files. The subprograms all have different names since they call
- each other. Of course, they have both PUBLIC and EXTRN
- statements. Only prog1 has:
-
- start:
-
- since that is where the program execution will start, and only
- prog1 has:
-
- ORG 100h
-
- since having it in the other files would leave unnecessary blank
- spaces in the other programs. Assemble the programs. When you
- link the programs, prog1 MUST be the first on the line:
-
- link prog1+prog2+prog3
- link prog1+prog3+prog2
-
- are both ok, but:
-
- link prog2+prog1+prog3
-
- will not work since the linker, exe2bin, and the loader are
- counting on the starting instruction being at 100h. If you change
- the order, you will either get complaints from EXE2BIN or the
- program won't run correctly. Now with:
-
- exe2bin prog1.exe prog1.com
-
- you have a .COM file. Try it out.
-
-
- Any .COM file can also be made into an .EXE file. We will make a
- simple program in .COM format, and then add the necessary things
- to make it an .EXE format. First, here's the .COM format.
-
-
- ; commode.asm
- ; * * * * * * * * * * * * * * *
- COMSEG SEGMENT PUBLIC 'CODE'
-
- ASSUME cs:COMSEG, ds:COMSEG, es:COMSEG, ss:COMSEG
-
- main proc NEAR
-
- ORG 100h
- start:
- ; get video mode int 10h, function 0Fh
- mov ah, 0Fh
- int 10h
-
- ; al = display mode
- mov bl, 10 ; divide by 10
- mov si, offset ones ; right hand digit
- mov cx, 2 ; 2 digit answer
- division_loop:
- mov ah, 0 ; clear ah
- div bl ; al/bl, remainder in ah
- add ah, '0' ; change to ascii
- mov [si], ah
- dec si ; one byte to the left
- loop division_loop
-
- ; display string
- mov dx, offset message ; int 21h, service 9
- mov ah, 9
- int 21h
-
- ret
-
- main endp
-
- ; - - - - START DATA BELOW THIS LINE
- message db "The current video mode is   " ; last blank gets the tens digit
- ones db ?
- db ".", 13, 10, "$"
- ; - - - - END DATA ABOVE THIS LINE
-
- COMSEG ENDS
- ; * * * * * * * * * * * * * * *
- END start
- ; - - - - - - - - - -
-
- This program gets the video mode, which is a number that tells
- you what mode the monitor is operating in. To find out what the
- number means, consult either of those two hardware books. It gets
- the mode through an interrupt. The mode is returned in AL. It
- then puts the number in a string and prints the string. We cannot
- link with asmhelp.obj, so it takes all this work simply to
- output a number.
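-
- For a two digit number there is a shorter route than the division
- loop. The following is only a sketch of an alternative, not the
- code used in this chapter. It relies on the AAM instruction,
- which divides AL by 10. Like the loop, it assumes that the mode
- is less than 100 and that the byte just in front of "ones" is
- free to be overwritten with the tens digit.
-
-         mov   ah, 0Fh             ; int 10h, function 0Fh: get video mode
-         int   10h                 ; al = display mode
-         aam                       ; ah = al / 10, al = al mod 10
-         add   ax, 3030h           ; change both digits to ascii
-         xchg  al, ah              ; al = tens digit, ah = ones digit
-         mov   word ptr ones-1, ax ; low byte (tens) goes to ones-1,
-                                   ; high byte (ones) goes to ones
-
- The word store works because the 8086 writes the low byte of AX
- at the lower address.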
-
- We'll call this commode.asm. Assemble, link, use exe2bin, and run
- it. Now let's make the .EXE counterpart. Here it is.
-
- ; exemode.asm
- ; * * * * * * * * * * * * * * *
- STACKSEG SEGMENT STACK 'STACK'
-
- dw 20 dup (?)
-
- STACKSEG ENDS
- ; - - - - - - - - - -
- COMSEG SEGMENT PUBLIC 'CODE'
-
- ASSUME cs:COMSEG, ds:COMSEG, es:COMSEG
-
- main proc FAR
-
- start:
- push ds ; same as before
- sub ax, ax
- push ax
-
- push cs ; ds = cs
- pop ds
-
- ; get video mode int 10h, service 15
- mov ah, 15
- int 10h
-
- ; al = display mode
- mov bl, 10 ; divide by 10
- mov si, offset ones ; right hand digit
- mov cx, 2 ; 2 digit answer
- division_loop:
- mov ah, 0 ; clear ah
- div bl ; al/bl, remainder in ah
- add ah, '0' ; change to ascii
- mov [si], ah
- dec si ; one byte to the left
- loop division_loop
-
- ; display string
- mov dx, offset message ; int 21h, service 9
- mov ah, 9
- int 21h
-
- ret
-
- main endp
-
- ; - - - - START DATA BELOW THIS LINE
- message db "The current video mode is   " ; last blank gets the tens digit
- ones db ?
- db ".", 13, 10, "$"
- ; - - - - END DATA ABOVE THIS LINE
-
- COMSEG ENDS
- ; * * * * * * * * * * * * * * *
- END start
-
- This is almost the same. We have put in a small stack segment.
- Here's the different part:
-
- ;----------
- ASSUME cs:COMSEG, ds:COMSEG, es:COMSEG
-
- main proc FAR
-
- start:
- push ds ; same as before
- sub ax, ax
- push ax
-
- push cs ; ds = cs
- pop ds
-
- ;----------
-
- SS is no longer in the ASSUME statement, since it now refers to
- the stack segment. CS, DS, and ES still refer to COMSEG. The
- procedure is now a FAR procedure. We have taken the ORG out,
- since that would simply waste 100h (256) bytes of space. Then we
- have the normal .EXE startup except the data is now in COMSEG, so
- we move CS to DS. That's it. We'll call this one exemode.asm.
- Assemble, link and run it. It should give you the same result.
-
- Here is the listing for both executable files:
-
- COMMODE COM 65 10-21-89 11:11p
- EXEMODE EXE 631 10-21-89 11:10p
-
- Notice how much bigger the .EXE file is. That is because the .EXE
- file has a bunch of information for the loader. Also, if you run
- both programs, the .COM file will start a little quicker. Those
- are the only advantages.
-
-
- PHASE ERRORS
-
- The assembler generates code in two steps. On the first pass, it
- calculates the address of each variable and machine instruction
- without actually writing code. For instance, if there is the
- instruction:
-
- mov ax, variable1
-
- The assembler will allocate 3 bytes for the instruction. If
- variable1 has already been defined, but is in ES, then the
- assembler will allocate 4 bytes; 1 for the segment override and 3
- for the instruction itself. If, however, variable1 has not been
- defined yet (it appears later in the code), then the assembler
- will assume that it is in DS and allocate 3 bytes. If it turns
- out that it is later defined to be in ES, then when the assembler
- generates code, it will write 4 bytes, 1 for the override and 3
- for the instruction. But this means that EVERYTHING after this
- instruction will have been shifted one byte, so EVERYTHING after
- the instruction will be at the wrong address. The assembler will
- detect this and print out a PHASE ERROR. This means that the
- machine code is garbage.
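-
- As a sketch of how this can arise, here is a hypothetical source
- file with two segments. The names CODESEG and EXTRASEG are made
- up for this illustration, and the exact message you get depends
- on your assembler version.
-
-         CODESEG   SEGMENT PUBLIC 'CODE'
-                   ASSUME cs:CODESEG, es:EXTRASEG
-
-                   mov  ax, variable1     ; forward reference: on the first
-                                          ; pass the assembler guesses DS
-                   ; mov ax, es:variable1 ; remedy: an explicit override is
-                                          ; the same size on both passes
-         CODESEG   ENDS
-
-         EXTRASEG  SEGMENT PUBLIC 'DATA'
-         variable1 dw ?                   ; only here does the assembler
-                                          ; learn which segment it is in
-         EXTRASEG  ENDS
-
- Another remedy is to put the data segments in front of the code,
- so that every variable is defined before it is used.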
-
- By having all four segment registers in the ASSUME statement of a
- .COM file, you guarantee that the assembler will not generate
- segment overrides.
-
- add dx, [bp]
-
- will have BP relative to SS.
-
- sub variable1, si
-
- will have variable1 relative to DS. This will go a long way
- towards eliminating errors in a .COM file.
-
-